A Software Toolkit for Sharing and Accessing Corpora Over the Internet

Author

  • Saturnino Luz
Abstract

This paper describes the Translational English Corpus (TEC) and the software tools developed to enable remote use of the corpus over the internet. The model underlying these tools is based on an extensible client-server architecture implemented in Java. We discuss the data and processing constraints which motivated the TEC architecture design and its impact on the efficiency and scalability of the system. We also suggest that the kind of distributed processing model adopted in TEC could play a role in fostering the availability of corpus linguistic resources to the research community.

Similar resources

MobileIoT Toolkit: Connecting the EPC Network to MobilePhones

In this paper we discuss the MobileIoT Toolkit. This software framework offers a number of tools to ease the design and implementation of Java Mobile application prototypes interacting with the Internet of Things. In particular, we focus on mobile phones accessing a “standardized Internet of Things”, known as the EPC Network (Electronic Product Code). In this paper we introduce the EPC Network ...

Building Large Corpora from the Web Using a New Efficient Tool Chain

Over the last decade, methods of web corpus construction and the evaluation of web corpora have been actively researched. Prominently, the WaCky initiative has provided both theoretical results and a set of web corpora for selected European languages. We present a software toolkit for web corpus construction and a set of significantly larger corpora (up to over 9 billion tokens) built using th...

CollabCAD: A Toolkit for Integrated Synchronous and Asynchronous Sharing of CAD Applications

We are developing CollabCAD, a novel software architecture and toolkit that supports sharing of arbitrary user-defined objects or applications over intranets and the internet. Developers can use CollabCAD to rapidly re-engineer existing CAD applications to be collaboration-capable or build new collaboration-capable CAD applications. CollabCAD provides the following functionalities: 1. Support f...

An Efficient Secret Sharing-based Storage System for Cloud-based Internet of Things

The Internet of Things (IoT) is a new information architecture based on the internet that develops interactions between objects and services in a secure and reliable environment. As the availability of smart devices rises, secure and scalable mass storage systems for aggregate data are required in IoT applications. In this paper, we propose a new method for storing aggregate data in Io...

Collecting Voices from the Cloud

The collection and transcription of speech data is typically an expensive and time-consuming task. Voice over IP and cloud computing are poised to greatly reduce this impediment to research on spoken language interfaces in many domains. This paper documents our efforts to deploy speech-enabled web interfaces to large audiences over the Internet via Amazon Mechanical Turk, an online marketplace ...

Publication date: 2000